
39 research outputs found

Blending big data analytics : review on challenges and a recent study

Author: Amalina Fairuz
Azizul Zati
Fong Ang
Imran Muhammad
Targio Hashem Ibrahim
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

With massive amounts of data collected every day, big data analytics has emerged as an important trend for many organizations. The collected data can contain information that may be key to solving wide-ranging problems in areas such as cyber security, marketing, healthcare, and fraud. Large companies, such as Facebook and Google, adopt analytics to analyze their large volumes of data for business analyses and decisions. Such analyses and decisions impact existing and future technology. In this paper, we explore how big data analytics is utilized as a technique for solving problems of complex and unstructured data using such technologies as Hadoop, Spark, and MapReduce. We also discuss the data challenges introduced by big data according to the literature, including its six V's. Moreover, we investigate case studies of big data analytics covering various analytics techniques, namely, text, voice, video, and network analytics. We conclude that big data analytics can bring positive changes in many fields, such as education, military, healthcare, politics, business, agriculture, banking, and marketing, in the future. © 2013 IEEE

Federation ResearchOnline

Real-time big data processing for anomaly detection : a survey

Author: Ahmed Ejaz
Ariyaluran Habeeb Riyaz
Gani Abdullah
Imran Muhammad
Nasaruddin Fariza
Targio Hashem Ibrahim
Publication venue: Elsevier Ltd
Publication date: 01/01/2019

The advent of connected devices and the omnipresence of the Internet have paved the way for intruders to attack networks, which leads to cyber-attacks, financial loss, information theft in healthcare, and cyber war. Hence, network security analytics has become an important area of concern and has gained intensive attention among researchers of late, specifically in the domain of network anomaly detection, which is considered crucial for network security. However, preliminary investigations have revealed that the existing approaches to detecting anomalies in networks are not effective enough, particularly in real time. The inefficacy of current approaches is mainly due to the massive volumes of data amassed through the connected devices. Therefore, it is crucial to propose a framework that effectively handles real-time big data processing and detects anomalies in networks. In this regard, this paper attempts to address the issue of detecting anomalies in real time. Accordingly, this paper surveys the state-of-the-art real-time big data processing technologies related to anomaly detection and the vital characteristics of associated machine learning algorithms. This paper begins with an explanation of the essential context and taxonomy of real-time big data processing, anomaly detection, and machine learning algorithms, followed by a review of big data processing technologies. Finally, the identified research challenges of real-time big data processing in anomaly detection are discussed. © 2018 Elsevier Lt

ZENODO

Federation ResearchOnline

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Profiling users' behavior, and identifying important features of review 'helpfulness'

Author: Abdullah Gani
Ibrahim Abaker Targio Hashem
Mohsen Marjani
Muhammad Bilal
Muhammad Ikramullah Lali
Nadia Malik
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

The increasing volume of online reviews and the use of review platforms leave tracks that can be used to explore interesting patterns. It is in the primary interest of businesses to retain and improve their reputation. Reviewers, on the other hand, tend to write reviews that can influence and attract people’s attention, which often leads to deliberate deviations from past rating behavior. Until now, very few studies have attempted to explore the impact of user rating behavior on review helpfulness. Moreover, other perspectives of user behavior in selecting and rating businesses still need to be investigated, and previous studies gave more attention to the review features and reported inconsistent findings on the importance of those features. To fill this gap, we introduce new business and reviewer features, modify existing ones, and propose a user-focused mechanism for review selection. This study aims to investigate and report changes in business reputation, user choice, and rating behavior through descriptive and comparative analysis. Furthermore, the relevance of various features for review helpfulness is identified by correlation, linear regression, and negative binomial regression. The analysis performed on the Yelp dataset shows that the reputation of the businesses has changed slightly over time. Moreover, 46% of the users chose a business with a minimum of 4 stars. The majority of users give 4-star ratings, and 60% of reviewers adopt irregular rating behavior. Our results show a slight improvement from using user rating behavior and choice features, whereas the significant increase in R² indicates the importance of reviewer popularity and experience features. The overall results show that the most significant features of review helpfulness are average user helpfulness, number of user reviews, average business helpfulness, and review length. The outcomes of this study provide important theoretical and practical implications for researchers, businesses, and reviewers.

UMS Institutional Repository

The role of big data in smart city

Author: Adewole Kayode
Ahmed Ejaz
Anuar Nor Badrul
Chang Victor
Chiroma Haruna
Gani Abdullah
Hashem Ibrahim Abaker Targio
Yaqoob Ibrar
Publication venue: 'Elsevier BV'
Publication date: 01/10/2016

The expansion of big data and the evolution of Internet of Things (IoT) technologies have played an important role in the feasibility of smart city initiatives. Big data offer the potential for cities to obtain valuable insights from a large amount of data collected through various sources, and the IoT allows the integration of sensors, radio-frequency identification, and Bluetooth in the real-world environment using highly networked services. The combination of the IoT and big data is an unexplored research area that has brought new and interesting challenges for achieving the goal of future smart cities. These new challenges focus primarily on problems related to business and technology that enable cities to actualize the vision, principles, and requirements of the applications of smart cities by realizing the main smart environment characteristics. In this paper, we describe the existing communication technologies and smart-based applications used within the context of smart cities. The visions of big data analytics to support smart cities are discussed by focusing on how big data can fundamentally change urban populations at different levels. Moreover, a future business model that can manage big data for smart cities is proposed, and the business and technological research challenges are identified. This study can serve as a benchmark for researchers and industries for the future progress and development of smart cities in the context of big data

Southampton (e-Prints Soton)

Crossref

A Survey on Underwater Wireless Sensor Networks: Requirements, Taxonomy, Recent Advances, and Open Research Challenges

Author: Abdullah Gani
Ibrahim Abaker Targio Hashem
Ismail Ahmedy
Mohd Yamani Idna Idris
Salmah Fattah
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

The domain of underwater wireless sensor networks (UWSNs) has received a lot of attention recently due to its advanced capabilities in ocean surveillance, marine monitoring, and application deployment for detecting underwater targets. However, the literature has not compiled the state of the art in this direction to identify the recent advancements fuelled by underwater sensor technologies. Hence, this paper offers an up-to-date analysis of the available evidence by reviewing studies from the past five years on various aspects that support network activities and applications in UWSN environments. This work was motivated by the need for robust and flexible solutions that can satisfy the requirements of the rapid development of underwater wireless sensor networks. This paper identifies the key requirements for achieving essential services as well as common platforms for UWSNs. It also contributes a taxonomy of the critical elements in UWSNs by devising a classification of architectural elements, communications, routing protocols and standards, security, and applications of UWSNs. Finally, the major challenges that remain open are presented as a guide for future research directions.

UMS Institutional Repository

JQPro : Join query processing in a distributed system for big RDF data using the hash-merge join technique

Author: Abulfaraj Anas W.
Ashraf Osman Ibrahim
Binzagr Faisal
Elzein Nahla Mohammed
Hashem Ibrahim Abaker Targio
Mazlina Abdul Majid
Publication venue: MDPI
Publication date: 01/03/2023

In the last decade, the volume of semantic data has increased exponentially, with Resource Description Framework (RDF) datasets in RDF repositories exceeding trillions of triples, and the size of RDF datasets continues to grow. With the increasing number of RDF triples, complex multiple RDF queries are increasingly in demand. Sometimes, such complex queries produce many common sub-expressions in a single query or over multiple queries running as a batch. In addition, it is difficult to minimize the number of RDF queries and the processing time for a large amount of related data in a typical distributed environment. To address this complication, we introduce a join query processing model for big RDF data, called JQPro. By adopting a MapReduce framework in JQPro, we developed three new algorithms, namely hash-join, sort-merge, and enhanced MapReduce-join, for join query processing of RDF data. In the experiment conducted, the JQPro model outperformed two popular algorithms, gStore and RDF-3X, with respect to average execution time. Furthermore, the JQPro model was also tested against RDF-3X, RDFox, and PARJs using the LUBM benchmark, and it performed better than the other models. In conclusion, the findings showed that JQPro achieved an improvement of 87.77% in terms of execution time. Hence, in comparison with the selected models, JQPro performs better.
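The hash-join idea named in this abstract can be illustrated with a minimal single-machine sketch. This is not the JQPro implementation, which the abstract describes only at a high level and which runs on MapReduce; the toy triples, the predicate names, and the helper functions `match`, `hash_join`, and `employee_cities` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Toy RDF triples as (subject, predicate, object); all values are made up.
TRIPLES = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("carol", "worksFor", "initech"),
    ("acme", "locatedIn", "berlin"),
    ("initech", "locatedIn", "austin"),
]


def match(triples, predicate):
    """Evaluate one triple pattern `?s <predicate> ?o` as (subject, object) pairs."""
    return [(s, o) for s, p, o in triples if p == predicate]


def hash_join(left, right):
    """Join left pairs (x, y) with right pairs (y, z) on the shared variable y."""
    # Build phase: index the right-hand side by its join key.
    index = defaultdict(list)
    for y, z in right:
        index[y].append(z)
    # Probe phase: look up each left-hand row in the index.
    return [(x, y, z) for x, y in left for z in index.get(y, [])]


def employee_cities(triples):
    """Answer `?person worksFor ?org . ?org locatedIn ?city` with one hash join."""
    return hash_join(match(triples, "worksFor"), match(triples, "locatedIn"))


if __name__ == "__main__":
    for row in employee_cities(TRIPLES):
        print(row)
```

In a distributed setting the build and probe sides would typically be partitioned by the join key so that matching rows meet on the same worker; that partitioning step is omitted here.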

UMP Institutional Repository

Indigenous food recognition model based on various convolutional neural network architectures for gastronomic tourism business analytics

Author: Chong Joon Hou
Ervin Gubin Moung
Farashazillah Yahya
Ibrahim Abakr Targio Hashem
Mohd Norhisham Razali @ Ghazali
Raihani Mohamed
Rozita Hanapi
Publication venue: 'MDPI AG'
Publication date: 01/01/2021

In gastronomic tourism, food is viewed as the central tourist attraction. Specifically, indigenous food is known to represent the expression of local culture and identity. To promote gastronomic tourism, it is critical to have a model for the food business analytics system. This research undertakes an empirical evaluation of recent transfer learning models for deep learning feature extraction for a food recognition model. The VIREO-Food172 Dataset and a newly established Sabah Food Dataset are used to evaluate the food recognition model. Afterwards, the model is implemented in a web application system as an attempt to automate food recognition. In this model, a fully connected layer with 11 and 10 Softmax neurons is used as the classifier for food categories in both datasets. Six pre-trained Convolutional Neural Network (CNN) models are evaluated as feature extractors to extract essential features from food images. The evaluation found that the EfficientNet-based feature extractor combined with a CNN classifier achieved the highest classification accuracy of 94.01% on the Sabah Food Dataset and 86.57% on the VIREO-Food172 Dataset. EFFNet as a feature representation outperformed Xception in terms of overall performance. However, Xception can be considered, despite some loss of accuracy, if computational speed and memory usage are more important than performance.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

UMS Institutional Repository

The role of big data in smart city

Author: Abdullah Gani
Ejaz Ahmed
Haruna Chiroma
Ibrahim Abaker Targio Hashem
Ibrar Yaqoob
Kayode Adewole
Nor Badrul Anuar
Victor Chang
Publication venue: 'Elsevier BV'
Publication date

Crossref

Credit card default prediction using machine learning techniques

Author: Alotaibi Faiz
Hashem Ibrahim Abaker Targio
Kasmiran Khairul Azhar
Sayjadah Yashna
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Credit risk plays a major role in the banking industry. Banks' main activities involve granting loans, credit cards, investments, mortgages, and other services. Credit cards have been one of the fastest-growing financial services offered by banks over the past years. However, with the growing number of credit card users, banks have been facing an escalating credit card default rate. Data analytics can therefore provide solutions to tackle this phenomenon and manage credit risks. This paper provides a performance evaluation of credit card default prediction. Logistic regression, an rpart decision tree, and random forest are used to test the variables in predicting credit default, and random forest proved to have the highest accuracy and area under the curve. This result shows that random forest best describes which factors should be considered when assessing the credit risk of credit card customers, with an accuracy of 82% and an area under the curve of 77%.
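The two metrics reported in this abstract, accuracy and area under the curve (AUC), can be computed from any classifier's scores. The following is a minimal sketch in plain Python, not the paper's evaluation code; the labels, the scores, the 0.5 threshold, and the function names are illustrative assumptions rather than values from the study.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases where the thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = customer defaulted, score = predicted default probability.
LABELS = [1, 0, 1, 0, 0, 1]
SCORES = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]

if __name__ == "__main__":
    print("accuracy:", accuracy(LABELS, SCORES))
    print("AUC:", auc(LABELS, SCORES))
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank the cases, which is one reason the two are often reported together for imbalanced problems such as default prediction.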

Crossref

Universiti Putra Malaysia Institutional Repository

Distributed Join Query Processing for Big RDF Data

Author: Elzein Nahla Mohammed
Fakherldin Mohammed
Hashem Ibrahim Abaker Targio
Mazlina Abdul Majid
Publication venue: 'American Scientific Publishers'
Publication date: 01/11/2018

The expansion of the services of the Semantic Web and the evolution of cloud computing technologies have significantly enhanced the capability of preserving and publishing information in standard open web formats, such that data can be both human-readable and machine-processable. This situation meets the challenge in the current big data era to effectively store, retrieve, and analyze resource description framework (RDF) data in swarms. Moreover, efficient data storage and retrieval that can scale to large amounts of possibly schema-less data have proven quite difficult to achieve, specifically for RDF data storage with complex and large graph patterns for representing semantic data, and for the SPARQL query language. In this paper, we provide a comprehensive discussion of the proposed algorithms for join query processing of RDF data, considering the MapReduce framework in a distributed environment. Moreover, we introduce a framework for RDF query processing and the benchmark that is used for the performance evaluation. Finally, we offer an evaluation discussion on distributed join query processing for big RDF data.

UMP Institutional Repository